NSF PAR Search | NSF Public Access Repository

Binomial Gradient-Based Meta-Learning for Enhanced Meta-Gradient Estimation

Zhang, Yilang; Jaeger-Mountain, Abraham; Li, Bingcong; Giannakis, Georgios B (April 2026, International Conference on Learning Representations)

Full Text Available
RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

Zhang, Yilang; Li, Bingcong; Giannakis, Georgios B (December 2025, Conference on Neural Information Processing Systems (NeurIPS 2025))

Full Text Available
Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm

Zhang, Yilang; Li, Bingcong; Giannakis, Georgios B (April 2025, IEEE)

Full Text Available
Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm

https://doi.org/10.1109/ICASSP49660.2025.10889586

Zhang, Yilang; Li, Bingcong; Giannakis, Georgios B (April 2025, IEEE International Conference on Acoustics, Speech, and Signal Processing)

Full Text Available
Zeroth-Order Optimization Finds Flat Minima

Zhang, Liang; Li, Bingcong; Thekumparampil, Kiran Koshy; Oh, Sewoong; Muehlebach, Michael; He, Niao (June 2025, https://doi.org/10.48550/arXiv.2506.05454)

Zeroth-order methods are extensively used in machine learning applications where gradients are infeasible or expensive to compute, such as black-box attacks, reinforcement learning, and language model fine-tuning. Existing optimization theory focuses on convergence to an arbitrary stationary point, but less is known about the implicit regularization that provides a fine-grained characterization of which particular solutions are reached. This paper shows that zeroth-order optimization with the standard two-point estimator favors solutions with small trace of Hessian, a measure widely used to distinguish between sharp and flat minima. The authors provide convergence rates of zeroth-order optimization to approximate flat minima for convex and sufficiently smooth functions, defining flat minima as minimizers that achieve the smallest trace of Hessian among all optimal solutions. Experiments on binary classification tasks with convex losses and language model fine-tuning support the theoretical findings.
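The two-point estimator mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian random direction and a symmetric finite difference, which is one common form of the estimator; the function names, step size, smoothing radius, and the quadratic test loss are all hypothetical choices for illustration and are not taken from the paper. The sketch only shows how descent proceeds without gradients; it does not demonstrate the flat-minima result.

```python
import random

def two_point_grad(f, x, mu, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Assumption: Gaussian direction u and a symmetric difference,
    g = (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u.
    """
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    scale = (f(x_plus) - f(x_minus)) / (2.0 * mu)
    return [scale * ui for ui in u]

def zo_descent(f, x0, lr=0.05, mu=1e-3, steps=2000, seed=0):
    """Run plain descent using only function values (no gradients)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = two_point_grad(f, x, mu, rng)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

def quadratic(x):
    """Hypothetical convex test loss with minimizer at (1, 1, 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)

x_final = zo_descent(quadratic, [0.0, 0.0, 0.0])
print(x_final)
```

Each step queries the loss twice and never calls a gradient routine, which is what makes the approach usable in black-box settings such as those the abstract lists.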
Full Text Available
Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using Mirror Descent

Zhang, Yilang; Li, Bingcong; Giannakis, Georgios B (April 2024, IEEE International Conference on Acoustics, Speech, and Signal Processing)

Full Text Available
Meta-Learning With Versatile Loss Geometries for Fast Adaptation Using Mirror Descent

https://doi.org/10.1109/ICASSP48485.2024.10448144

Zhang, Yilang; Li, Bingcong; Giannakis, Georgios B (April 2024, IEEE)

Utilizing task-invariant prior knowledge extracted from related tasks, meta-learning is a principled framework that empowers learning a new task, especially when data records are limited. A fundamental challenge in meta-learning is how to quickly "adapt" the extracted prior in order to train a task-specific model within a few optimization steps. Existing approaches deal with this challenge using a preconditioner that enhances convergence of the per-task training process. Though effective at locally representing a quadratic training loss, these simple linear preconditioners can hardly capture complex loss geometries. The present contribution addresses this limitation by learning a nonlinear mirror map, which induces a versatile distance metric to enable capturing and optimizing a wide range of loss geometries, hence facilitating the per-task training. Numerical tests on few-shot learning datasets demonstrate the superior expressiveness and convergence of the advocated approach.
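For readers unfamiliar with mirror descent, the following minimal sketch shows the basic update with a fixed, hand-chosen mirror map. It assumes the negative-entropy map on the probability simplex (the exponentiated-gradient update), not the learned nonlinear mirror map that the abstract describes; the function name, step size, and linear test loss are hypothetical.

```python
import math

def mirror_descent_simplex(grad, x0, lr=0.1, steps=500):
    """Mirror descent on the probability simplex.

    Assumption: the mirror map is negative entropy, so the dual-space
    gradient step becomes a multiplicative update followed by
    renormalization onto the simplex.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Multiplicative update induced by the negative-entropy mirror map.
        w = [xi * math.exp(-lr * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        # Renormalize so the iterate stays a probability vector.
        x = [wi / total for wi in w]
    return x

# Hypothetical linear loss c . x, whose minimizer over the simplex
# places all mass on the smallest cost (index 1 here).
costs = [0.3, 0.1, 0.6]
x_star = mirror_descent_simplex(lambda x: costs, [1 / 3, 1 / 3, 1 / 3])
print(x_star)
```

Swapping the mirror map changes the geometry of the update; the abstract's contribution is to learn that map rather than fix it in advance as done here.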
Full Text Available
Enhancing Sharpness-Aware Optimization Through Variance Suppression

Li, Bingcong; Giannakis, Georgios B (December 2023, Proceedings of Neural Information Processing Systems (NeurIPS))

Full Text Available
Enhancing Sharpness-Aware Optimization Through Variance Suppression

Li, Bingcong; Giannakis, Georgios B (November 2023, Proceedings of Neural Information Processing Systems (NeurIPS))

Full Text Available
Conic Descent Redux for Memory-Efficient Optimization

https://doi.org/10.1109/IEEECONF59524.2023.10476894

Li, Bingcong; Giannakis, Georgios B (October 2023, IEEE)

Conic programming has well-documented merits in a gamut of signal processing and machine learning tasks. This contribution revisits a recently developed first-order conic descent (CD) solver, and advances it in three aspects: intuition, theory, and algorithmic implementation. It is found that CD can afford an intuitive geometric derivation that originates from the dual problem. This opens the door to novel algorithmic designs, with a momentum variant of CD, momentum conic descent (MOCO), exemplified. Diving deeper into the dual behavior of CD and MOCO reveals: i) an analytically justified stopping criterion; and ii) the potential to design preconditioners to speed up dual convergence. Lastly, to scale semidefinite programming (SDP), especially for low-rank solutions, a memory-efficient MOCO variant is developed and numerically validated.
Full Text Available
